In Q, we described the set of words that appear in the coding of smooth (resp. analytic) curves at 
arbitrarily small scale. The aim of this paper is to compute the complexity of those languages. 
1 Introduction 

A smooth curve is a map $\gamma$ from a compact interval $I$ of the real line to the plane, which is $C^\infty$ and such 
that $\|\gamma'(t)\| > 0$ for any $t \in I$ (this last property is called regularity). Any such curve can (and will be 
considered to) be arc-length reparametrised (i.e. $\forall t \in I$, $\|\gamma'(t)\| = 1$). 

We can approximate such a curve by drawing a square grid of mesh $h$ on the plane, and looking at the 
sequence of squares that the curve meets. For a generic position of the grid, the curve $\gamma$ does not hit any 
corner and crosses the grid transversally, hence the curve passes from a square to a square that is located 
either right, up, left or down of it. We record this sequence of moves and define the cutting sequence of 
the curve $\gamma$ with respect to this grid as a word $w$ on the alphabet $\{r,u,l,d\}$ which tracks the lines of the 
grid crossed by the curve $\gamma$. 
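
As a rough illustration of this coding, the following Python sketch reads a cutting sequence off a densely sampled curve. The function name `cutting_sequence`, the sampling approach and the sample segment are assumptions made for the illustration, not part of the original construction; the sketch rejects inputs where two consecutive samples lie in squares that are neither equal nor adjacent (sampling too coarse, or a corner being hit).

```python
import math

def cutting_sequence(points, h=1.0):
    """Cutting sequence of a sampled curve with respect to the grid of mesh h.

    `points` is a list of (x, y) samples along the curve (hypothetical input
    format). Each change of square is recorded as r, l, u or d.
    """
    moves = {(1, 0): "r", (-1, 0): "l", (0, 1): "u", (0, -1): "d"}
    word = []
    previous = None
    for x, y in points:
        cell = (math.floor(x / h), math.floor(y / h))
        if previous is not None and cell != previous:
            step = (cell[0] - previous[0], cell[1] - previous[1])
            if step not in moves:
                raise ValueError("samples too coarse, or the curve hits a corner")
            word.append(moves[step])
        previous = cell
    return "".join(word)

# A straight segment from (0.25, 0.125) to (3.25, 2.125), sampled 1001 times.
segment = [(0.25 + 3 * k / 1000, 0.125 + 2 * k / 1000) for k in range(1001)]
print(cutting_sequence(segment))
```

Translating the grid amounts to translating the samples, so the same function can be used to observe that one curve may have several cutting sequences for a given mesh.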

The following picture shows a curve $\gamma$ with cutting sequence rruuldrrrd. 

[Figure: a curve $\gamma$ drawn on a square grid of mesh $h$.]

Note that since the grid can be translated, a given curve may have more than one cutting sequence for a 
given mesh h. Our knowledge of the curve from one of its cutting sequences increases when the mesh 
h decreases, and when the mesh approaches 0, the local patterns of the cutting sequence play the role 
of discrete tangents. Such words are called tangent words; their first properties were described in Q. 
Cutting sequences associated to straight segments are known to be exactly the balanced words, which 
are also the finite factors of Sturmian words. It turns out that the tangent words strictly contain balanced 
words, and that 2-balanced words strictly contain tangent words. The aim of this note is to count the 
number of tangent words (resp. tangent analytic words) of a given length, in order to quantify those 
inclusions. 
2 Tangent words 

Tangent words are the finite words that appear in the cutting sequences of some smooth curve at arbitrarily 
small scale. More precisely, let $F(\gamma,G)$ denote the set of factors of the cutting sequence of the curve $\gamma$ 
with respect to the square grid $G$ (when the curve hits a corner, the cutting sequence is not defined and 
we set $F(\gamma,G) = \emptyset$). We define the asymptotic language of $\gamma$ by 

$$T(\gamma) = \limsup_{\mathrm{mesh}(G) \to 0} F(\gamma,G) = \bigcap_{\varepsilon > 0} \; \bigcup_{\mathrm{mesh}(G) < \varepsilon} F(\gamma,G).$$

More generally, when $X$ is a set of curves, let us denote by $T(X)$ the set $\bigcup_{\gamma \in X} T(\gamma)$. When $X$ is the set 
of smooth curves, we denote $T(X)$ by $T^\infty$, and call its elements tangent words. When $X$ is the set of 
analytic curves, we denote $T(X)$ by $T^\omega$, and call its elements analytic tangent words. The two languages 
$T^\infty$ and $T^\omega$ are factorial and extendable. 

For the sake of simplicity, we will focus on curves going right and up, i.e. smooth curves such that 
both coordinates of $\gamma'(t)$ are positive for any $t$. Let us rename r and u by 0 and 1 respectively to stick to 
the usual notation about binary words. 

The following results are proved in Q. 

2.1 Combinatorial characterisation (desubstitution) 

Balanced words are known to have a hierarchical structure, where the morphisms $\sigma_0 = (0 \mapsto 0, 1 \mapsto 10)$ 
and $\sigma_1 = (0 \mapsto 01, 1 \mapsto 1)$ play a crucial role [8], Q. The same renormalisation applies to tangent words. 
Given a finite word $w$, we can "desubstitute" it by 

• removing one 0 per run of 0 if 11 does not appear in w, or 

• removing one 1 per run of 1 if 00 does not appear in w. 

This desubstitution map (denoted by $\delta$) consists in removing one letter per run of the non-isolated letter. 
An accelerated version of this desubstitution consists in removing, from every run of the non-isolated letter, 
a run whose length is that of the shortest inner run (this includes possible leading and trailing runs, even if they 
have shorter length). 

If we repeat this process as much as possible, we get a derived word denoted by $d(w)$. The word $w$ is 
balanced if, and only if, $d(w)$ is the empty word, and the derivation process is related to the continued 
fraction expansion of the slope of the associated straight segment. 
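
The desubstitution and the derived word can be sketched in Python as follows. This is a minimal illustration of the non-accelerated map, under assumptions of ours: the function names are hypothetical, and when neither 00 nor 11 occurs (so that both rules apply) the sketch arbitrarily removes 0s first.

```python
import re

def desubstitute(w):
    """One desubstitution step on a binary word, or None if no step applies."""
    # Remove one 0 per run of 0 when 1 is isolated (11 does not appear).
    if "0" in w and "11" not in w:
        return re.sub("0+", lambda m: m.group()[1:], w)
    # Remove one 1 per run of 1 when 0 is isolated (00 does not appear).
    if "1" in w and "00" not in w:
        return re.sub("1+", lambda m: m.group()[1:], w)
    # Both 00 and 11 appear, or the word is empty: nothing to remove.
    return None

def derived_word(w):
    """Repeat the desubstitution as long as possible and return d(w)."""
    while True:
        shorter = desubstitute(w)
        if shorter is None:
            return w
        w = shorter

print(repr(derived_word("0100101")))  # a balanced word
print(repr(derived_word("0011")))     # contains both 00 and 11
```

Each step strictly shortens the word, so the loop terminates; a word is then reported as balanced exactly when the returned string is empty.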

A word is said to be diagonal if it is recognised by the following automaton with three states, which 
are all considered as initial and accepting: 

[Figure: three-state automaton recognising diagonal words.]

A word is said to be thin diagonal if it is diagonal and only two states are visited during its recognition. 

A word is said to be non-oscillating diagonal if it is recognised by the following automaton with 
eight states, which are all considered as initial and accepting: 
Proposition 1 A finite word w is tangent if, and only if, d(w) is diagonal. 

A finite word w is tangent analytic if, and only if, d(w) is non-oscillating diagonal. 

For example, the word w = 100100010010010010001001000100 is tangent analytic since it can be 
desubstituted (removing two 0s from each run of 0) as 110111101101, and then (removing two 1s from each 
run of 1, the trailing run 1 being erased) as 01100 = d(w), which is non-oscillating diagonal (start from the 
bottom left state). 

2.2 Geometric characterisation 

Proposition 2 A word $w$ is tangent if, and only if, for any $\varepsilon > 0$, $w$ is the cutting sequence of a smooth 
curve $\gamma$ which is $\varepsilon$-close (for the $C^1$ norm) to a straight segment (the grid is fixed). 

A word $w$ is tangent analytic if, and only if, for any $\varepsilon > 0$, $w$ is the cutting sequence of a smooth curve $\gamma$ 
with nowhere zero curvature which is $\varepsilon$-close (for the $C^1$ norm) to a straight segment (the grid is fixed). 

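For example, the word 0110100110 is tangent and the word 1001010110 is tangent analytic: 

[Figure: two curves close to a straight segment, one with cutting sequence 0110100110 (tangent), the other with cutting sequence 1001010110 (tangent analytic).]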



3 Complexity 

The complexity of a language $L$ is the map that counts, for any integer $n$, the number of elements of $L$ of 
length $n$. It is usually denoted by $p_n(L)$. 

The complexity of the balanced words $B$ was studied in [4], [6] and Q, where it was proved to be equal 
to: 

$$p_n(B) = 1 + \sum_{i=1}^{n} \sum_{j=1}^{i} \varphi(j) = 1 + \sum_{i=1}^{n} (n - i + 1)\,\varphi(i),$$

where $\varphi$ denotes the Euler totient function: $\varphi(n) = \mathrm{card}\{1 \leq k \leq n \mid \gcd(k,n) = 1\}$. 
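
The two expressions of this formula can be compared numerically. The short Python sketch below is only a sanity check of the arithmetic; the function names are ours, and the totient is computed naively from its definition.

```python
from math import gcd

def totient(n):
    """Euler totient: number of k in 1..n that are coprime with n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def balanced_complexity(n):
    """Weighted form: 1 + sum over i of (n - i + 1) * phi(i)."""
    return 1 + sum((n - i + 1) * totient(i) for i in range(1, n + 1))

def balanced_complexity_double_sum(n):
    """Double-sum form: 1 + sum over i <= n and j <= i of phi(j)."""
    return 1 + sum(totient(j) for i in range(1, n + 1) for j in range(1, i + 1))

print([balanced_complexity(n) for n in range(7)])
```

The two forms agree because $\varphi(j)$ is counted once for each $i$ with $j \leq i \leq n$, that is $n - j + 1$ times.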
To compute the complexity of $T^\infty$ and $T^\omega$, we will use the tools introduced by Julien Cassaigne using 
bispecial factors [3]. They have been used in the context of billiards in [2]. Let $L$ be a factorial and 
extendable language on the alphabet $\{0,1\}$. A word $w$ in $L$ is said to be bispecial if $0w$, $1w$, $w0$, $w1$ are 
in $L$. A bispecial factor $w$ is called 

• weak bispecial if $\mathrm{card}\{(a,b) \in \{0,1\}^2 \mid awb \in L\} = 2$, 

• ordinary bispecial if $\mathrm{card}\{(a,b) \in \{0,1\}^2 \mid awb \in L\} = 3$, 

• strong bispecial if $\mathrm{card}\{(a,b) \in \{0,1\}^2 \mid awb \in L\} = 4$. 
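
These three cases can be tested mechanically on a finite portion of a language. The Python sketch below does so for the balanced words, enumerated by brute force up to a small length; the helper names, the representation of the language as a set of strings and the bound `max_len` are assumptions made for the illustration.

```python
from itertools import product

def is_balanced(w):
    """Factors of equal length differ by at most one in their number of 1s."""
    for m in range(1, len(w) + 1):
        counts = {w[i:i + m].count("1") for i in range(len(w) - m + 1)}
        if max(counts) - min(counts) > 1:
            return False
    return True

def bispecial_kind(w, language):
    """Return 'weak', 'ordinary' or 'strong', or None if w is not bispecial."""
    if not all(x in language for x in ("0" + w, "1" + w, w + "0", w + "1")):
        return None
    extensions = sum(a + w + b in language for a in "01" for b in "01")
    return {2: "weak", 3: "ordinary", 4: "strong"}.get(extensions)

def strong_minus_weak(n, language):
    """Number of strong minus number of weak bispecial factors of length n."""
    kinds = [bispecial_kind(w, language) for w in language if len(w) == n]
    return kinds.count("strong") - kinds.count("weak")

max_len = 8  # words of length n are classified reliably only for n <= max_len - 2
words = ("".join(t) for m in range(max_len + 1) for t in product("01", repeat=m))
language = {w for w in words if is_balanced(w)}

print(bispecial_kind("00", language), bispecial_kind("01", language))
```

For instance 01 is ordinary among balanced words because 0011 is not balanced, whereas the four extensions of 00 are balanced.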

Let $wb_n(L)$ (resp. $sb_n(L)$) denote the number of weak (resp. strong) bispecial factors of length $n$ in $L$. 
Let $s_n(L)$ denote the first difference $p_{n+1}(L) - p_n(L)$. We have: 

$$s_{n+1}(L) - s_n(L) = sb_n(L) - wb_n(L).$$

Hence, by summing twice, if $L$ is nontrivial, we have: 

$$p_n(L) = 1 + n + \sum_{i=0}^{n-1} \sum_{j=0}^{i-1} \bigl(sb_j(L) - wb_j(L)\bigr).$$

Let us first describe the combinatorial structure of bispecial factors in $T^\infty$. Let $w$ be a bispecial factor. If 
$w$ is not diagonal, then it can be desubstituted (in a single way) and $\delta(w)$ is a bispecial factor of the same 
kind. Otherwise, if $w$ is thin diagonal, then it is strong or ordinary bispecial depending on the parity of 
its length. Otherwise, $w$ is diagonal and the three states are visited during its recognition: $w$ is strong 
bispecial. Hence, there is no weak bispecial factor in $T^\infty$. This also holds for $T^\omega$. 

The geometric characterisation of tangent (resp. tangent analytic) words is convenient to describe and 
count the strong bispecial factors. We can visualise the strong bispecial factors as follows. Pick a segment 
from $(0,0)$ to $(p,q) \in \mathbb{Z}^2$. 

If there is no integer point on the way (which happens precisely when $\gcd(p,q) = 1$), the coding of the 
corresponding open interval is a bispecial factor of length $p + q - 2$ in both $T^\infty$ and $T^\omega$. Those words 
are also the bispecial factors for balanced words. There are $\varphi(n + 2)$ such words of length $n$; this is the 
geometrical meaning of Lipatov's formula B. 
Balanced bispecial factors of length 8 
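
The double summation can be checked numerically on the balanced words. The Python sketch below assumes the normalisation $p_0(L) = 1$ and $p_1(L) = 2$ (our reading of "nontrivial"), so that $p_n(L) = 1 + n + \sum_{i<n} \sum_{j<i} (sb_j(L) - wb_j(L))$, and takes $sb_j(B) - wb_j(B) = \varphi(j + 2)$ as in the count of balanced bispecial factors. The function names are hypothetical.

```python
from math import gcd

def totient(n):
    """Euler totient: number of k in 1..n that are coprime with n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def complexity_from_bispecial(n, sb, wb=lambda j: 0):
    """Sum twice the differences sb(j) - wb(j), assuming p_0 = 1 and p_1 = 2."""
    return 1 + n + sum(sb(j) - wb(j) for i in range(n) for j in range(i))

# Balanced words: phi(j + 2) bispecial factors of length j contribute to the sum.
values = [complexity_from_bispecial(n, lambda j: totient(j + 2)) for n in range(7)]
print(values)
```

The values obtained coincide with those of the closed formula $1 + \sum_{i=1}^{n} (n - i + 1)\varphi(i)$ for the complexity of balanced words.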
Otherwise, there are $k \geq 1$ integer points on the way. For tangent analytic words, each such segment corresponds 
to two bispecial factors of length $p + q - 2$: one bending above the $k$ points, another bending under the $k$ 
points. There are $2(n + 2 - \varphi(n + 2))$ such words of length $n$. 




Tangent analytic bispecial factors of length 8 



For tangent words, each such segment corresponds to $2^k$ bispecial factors of length $p + q - 2$, corresponding 
to all the possibilities of slaloming around the $k$ integer points on the way. Hence, there are 
$\sum_{d \mid n+2} \varphi(n+2)\, 2^{(n+2)/d - 1}$ strong bispecial factors of length $n$ in $T^\infty$. 




Tangent bispecial factors of length 8 



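Proposition 3 We have: 

$$p_n(T^\omega) = 1 + n + \sum_{i=1}^{n} \sum_{j=2}^{i} \bigl(2j - \varphi(j) - 1\bigr),$$

$$p_n(T^\infty) = 1 + n + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=2}^{i} \sum_{d \mid j} \varphi(j)\, 2^{j/d}.$$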
4 Conclusion 

Let us recall that a word w is k-balanced if: 

$$\forall u, v \in \mathrm{Fact}(w), \quad |u| = |v| \Rightarrow \bigl|\, |u|_1 - |v|_1 \bigr| \leq k.$$
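
This definition translates directly into a brute-force test. The Python sketch below is only an illustration (the function name is ours); it compares the numbers of 1s in all factors of each length.

```python
def is_k_balanced(w, k):
    """Check that two factors of w of equal length differ by at most k in their number of 1s."""
    for m in range(1, len(w) + 1):
        counts = [w[i:i + m].count("1") for i in range(len(w) - m + 1)]
        if max(counts) - min(counts) > k:
            return False
    return True

# The tangent word 0110100110 contains both 00 and 11: it is 2-balanced but not 1-balanced.
print(is_k_balanced("0110100110", 1), is_k_balanced("0110100110", 2))
```

The check is cubic in the length of the word, which is enough for experimenting with short words.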
Each class of words is strictly included in the next one: 

• 1-balanced words (digital straight segments) 

• tangent analytic words 

• tangent words 

• 2-balanced words 

The complexity of the first two classes is cubic, whereas the complexity of the last two classes is 
exponential. It can be shown that analytic tangent words can be written as a concatenation of two 
1-balanced words. What is the gap between tangent words and 2-balanced words? 